import pandas as pd
# source: https://www.kaggle.com/datasets/ashydv/housing-dataset/code
df = pd.read_csv('Housing.csv')
df.head()
| | price | area | bedrooms | bathrooms | stories | mainroad | guestroom | basement | hotwaterheating | airconditioning | parking | prefarea | furnishingstatus |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 13300000 | 7420 | 4 | 2 | 3 | yes | no | no | no | yes | 2 | yes | furnished |
| 1 | 12250000 | 8960 | 4 | 4 | 4 | yes | no | no | no | yes | 3 | no | furnished |
| 2 | 12250000 | 9960 | 3 | 2 | 2 | yes | no | yes | no | no | 2 | yes | semi-furnished |
| 3 | 12215000 | 7500 | 4 | 2 | 2 | yes | no | yes | no | yes | 3 | yes | furnished |
| 4 | 11410000 | 7420 | 4 | 1 | 2 | yes | yes | yes | no | yes | 2 | no | furnished |
A bar chart shows the distribution of a categorical variable, or of ordinal subgroups of the data. It makes it easy to see which groups are largest or most common, and how the remaining groups compare with them.
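# Bar chart
import seaborn as sns
import matplotlib.pyplot as plt

sns.set(style="darkgrid")
total = float(len(df))  # number of houses, used to turn counts into percentages
ax = sns.countplot(x="bedrooms", data=df)

# Label each bar with its share of all houses
for p in ax.patches:
    percentage = '{:.1f}%'.format(100 * p.get_height() / total)
    x = p.get_x() + p.get_width() / 2  # horizontal centre of the bar
    y = p.get_height()
    ax.annotate(percentage, (x, y), ha='center', va='bottom')

# The bar heights are counts; the percentages appear only in the annotations
ax.set(title="Number of Bedrooms in Houses", ylabel="Number of houses", xlabel="Number of bedrooms")
plt.show()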
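The percentages annotated on the bars are simply category counts divided by the total. As a minimal sketch, assuming a small hypothetical `bedrooms` series in place of `df["bedrooms"]`, the same shares can be computed directly with pandas:

```python
import pandas as pd

# Hypothetical values standing in for df["bedrooms"]
bedrooms = pd.Series([3, 3, 4, 2, 3, 4, 3, 2], name="bedrooms")

# normalize=True returns fractions of the total instead of raw counts
shares = bedrooms.value_counts(normalize=True).sort_index() * 100

print(shares.round(1))
```

Checking the computed shares against the bar annotations is a quick way to confirm that the labels sit on the right bars.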
A small multiple is a visualization made of several charts arranged in a grid, all drawn with the same axes and scales. Because every panel is built the same way, the series can be compared with each other at a glance.
# Small Multiple
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic data: nine noisy series over x = 1..10 (this overwrites the housing df)
df = pd.DataFrame({
    'x': range(1, 11),
    'y1': np.random.randn(10),
    'y2': np.random.randn(10) + range(1, 11),
    'y3': np.random.randn(10) + range(11, 21),
    'y4': np.random.randn(10) + range(6, 16),
    'y5': np.random.randn(10) + range(4, 14) + (0, 0, 0, 0, 0, 0, 0, -3, -8, -6),
    'y6': np.random.randn(10) + range(2, 12),
    'y7': np.random.randn(10) + range(5, 15),
    'y8': np.random.randn(10) + range(4, 14),
    'y9': np.random.randn(10) + range(4, 14),
})

palette = plt.get_cmap('Set1')

num = 0
for column in df.drop('x', axis=1):
    num += 1
    color = palette(num - 1)  # Set1 has 9 colours, indexed 0..8
    plt.subplot(3, 3, num)
    plt.plot(df['x'], df[column], marker='', color=color, linewidth=1.9, alpha=0.9, label=column)

    # Same limits for every chart
    plt.xlim(0, 10)
    plt.ylim(-2, 22)

    # Tick labels only on the bottom row and the left column
    if num in range(7):
        plt.tick_params(labelbottom=False)
    if num not in [1, 4, 7]:
        plt.tick_params(labelleft=False)

    # Add title
    plt.title(column, loc='left', fontsize=12, fontweight=0, color=color)

# General title
plt.suptitle("How top 9 cryptos evolved\nthese past few days?", fontsize=13, fontweight=0, color='black', style='italic', y=1.02)

# Axis titles, placed in figure coordinates rather than in the last subplot
fig = plt.gcf()
fig.text(0.5, 0.02, 'Time', ha='center', va='center')
fig.text(0.06, 0.5, 'Note', ha='center', va='center', rotation='vertical')

# Show the graph
plt.show()
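The same grid can also be built with `plt.subplots`, which shares the axes and hides the inner tick labels without any manual bookkeeping. The following is a minimal sketch under stated assumptions: the `series` dictionary is hypothetical random data, and the axis captions "Time" and "Value" are placeholders.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: nine noisy upward trends over x = 1..10
rng = np.random.default_rng(0)
x = np.arange(1, 11)
series = {f"y{i}": rng.normal(size=10) + x + i for i in range(1, 10)}

# sharex/sharey give every panel identical limits and keep tick labels
# only on the outer row and column of the grid
fig, axes = plt.subplots(3, 3, sharex=True, sharey=True, figsize=(9, 7))
for ax, (name, y) in zip(axes.flat, series.items()):
    ax.plot(x, y, linewidth=1.9, alpha=0.9)
    ax.set_title(name, loc="left", fontsize=12)

# Common axis captions, placed in figure coordinates
fig.text(0.5, 0.02, "Time", ha="center", va="center")
fig.text(0.04, 0.5, "Value", ha="center", va="center", rotation="vertical")

plt.show()
```

Sharing the axes at creation time means the limits stay consistent even if the data range changes, whereas the loop above hard-codes them with `plt.xlim` and `plt.ylim`.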
import pandas as pd
import holoviews as hv
from holoviews import opts, dim
from bokeh.sampledata.les_mis import data
hv.extension('bokeh')
hv.output(size=200)
The Chord element allows representing the inter-relationships between data points in a graph. The nodes are arranged radially around a circle with the relationships between the data points drawn as arcs (or chords) connecting the nodes. The number of chords is scaled by a weight declared as a value dimension on the Chord element.
If the weight values are integers, they define the number of chords to be drawn between the source and target nodes directly. If the weights are floating point values, they are normalized to a default of 500 chords, which are divided up among the edges. Any non-zero weight will be assigned at least one chord.
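To make the weighting rule concrete, here is a small pure-Python sketch of the behaviour described above. It is an illustration only, not HoloViews' actual implementation: the function name `allocate_chords` and its `max_chords` parameter are hypothetical, and rounding up is an assumption chosen so that every non-zero weight receives at least one chord.

```python
import math

def allocate_chords(weights, max_chords=500):
    """Illustrative chord counts per edge (hypothetical helper, not the HoloViews code)."""
    # Integer weights are used directly as the number of chords
    if all(isinstance(w, int) for w in weights):
        return list(weights)

    total = sum(weights)
    if total == 0:
        return [0 for _ in weights]

    # Floating-point weights: split max_chords proportionally, rounding up
    # so that any non-zero weight is drawn with at least one chord
    return [math.ceil(w / total * max_chords) if w > 0 else 0 for w in weights]

print(allocate_chords([1, 8, 10]))        # integer weights pass through
print(allocate_chords([0.5, 0.5]))        # floats share the 500 chords
print(allocate_chords([0.0001, 0.9999]))  # a tiny weight still gets a chord
```

Because of the rounding, the counts for floating-point weights can add up to slightly more than `max_chords` in this sketch.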
The Chord element is a type of Graph element and shares the same constructor. The most basic constructor accepts a columnar dataset of the source and target nodes and an optional value. Here we supply a dataframe containing the number of dialogues between characters of the Les Misérables musical. The data contains source and target node indices and an associated value column:
links = pd.DataFrame(data['links'])
print(links.head(3))
| | source | target | value |
|---|---|---|---|
| 0 | 1 | 0 | 1 |
| 1 | 2 | 0 | 8 |
| 2 | 3 | 0 | 10 |
In the simplest case we can construct the Chord by passing it just the edges:
hv.Chord(links)
The plot automatically adds hover and tap support, letting us reveal the connections of each node.
To add node labels and other information we can construct a Dataset with a key dimension of node indices.
Additionally we can now color the nodes and edges by their index and add some labels. The labels, node_color and edge_color options allow us to reference dimension values by name.